Abstract

There has been much recent interest in the satisfiability of random Boolean formulas. A
random $k$-SAT formula is the conjunction of $m$ random clauses, each of which is the disjunction
of $k$ literals (a variable or its negation). It is known that when the number of variables $n$ is
large, there is a sharp transition from satisfiability to unsatisfiability; in the case of 2-SAT this
happens when $m/n \to 1$, and for 3-SAT the critical ratio is thought to be $m/n \approx 4.2$. The
sharpness of this transition is characterized by a critical exponent, sometimes called $\nu = \nu_k$
(the smaller the value of $\nu$, the sharper the transition). Experiments have suggested that
$\nu_3 = 1.5 \pm 0.1$, $\nu_4 = 1.25 \pm 0.05$, $\nu_5 = 1.1 \pm 0.05$, $\nu_6 = 1.05 \pm 0.05$, and
heuristics have suggested that $\nu_k \to 1$ as $k \to \infty$. We give here a simple proof that each of
these exponents is at least 2 (provided the exponent is well-defined). This result holds for each of
the three standard ensembles of random $k$-SAT formulas: $m$ clauses selected uniformly at random
without replacement, $m$ clauses selected uniformly at random with replacement, and each clause
selected with probability $p$ independent of the other clauses. We also obtain similar results for
$q$-colorability and the appearance of a $q$-core in a random graph.

1. Introduction 

In the past decade many researchers have studied the satisfiability of random Boolean formulas, in
an attempt to understand the "average case" of NP-complete problems. See [15] for a survey. Let
$n$ denote the number of Boolean variables. A literal is either a Boolean variable or its negation. A
$k$-clause is the OR (disjunction) of $k$ literals whose underlying variables are all distinct. A random
$k$-SAT formula is the AND (conjunction) of $m$ uniformly random $k$-clauses. A formula is satisfiable
if there is an assignment to the Boolean variables for which the formula evaluates to TRUE. For
random 3-SAT it has been observed empirically that there is a critical value $\alpha_3 \approx 4.2$ such that
when $n$ is large and $m/n < \alpha_3 - \varepsilon$, the formula is nearly always satisfiable, while if
$m/n > \alpha_3 + \varepsilon$, the formula is nearly always unsatisfiable. Furthermore, determining whether
or not a formula is satisfiable appears to be hardest when the ratio $m/n$ is about $\alpha_3$ [23]. Similar
phenomena occur for other values of $k$, except that for $k = 2$ the formulas are always easy
(deterministically). Consequently there have been many empirical as well as rigorous studies of this
transition from satisfiable to unsatisfiable.



It is known rigorously [12] that the SAT-to-UNSAT transition is sharp, i.e. that at some critical
ratio of $m/n$ the probability of satisfiability rapidly drops from close to 1 to close to 0. But for
$k > 2$ it has not been proved that the critical ratio of $m/n$ tends to a constant, as opposed to
being a slowly varying function of $n$ that oscillates between its known lower and upper bounds of
$(\log 2)2^{k-1} - (\log 2 + 1)/2 - o_k(1)$ [?] and $(\log 2)2^k$ [?]. (When $k = 3$, the tighter bounds
of 3.42 [?] and 4.506 [?] are known.)

One basic feature of the SAT-to-UNSAT transition is its characteristic width. This width is the
amount $\Delta$ by which $m$ needs to be increased for the probability of satisfiability to drop from 2/3
to 1/3, or more generally, to drop from $1 - \varepsilon$ to $\varepsilon$. The characteristic width is thought
to grow as a polynomial in $n$, so that $\Delta = \Theta(n^{1-1/\nu})$, where the constant hidden by the
$\Theta()$ depends on $\varepsilon$, but the critical exponent $\nu$ does not. (Using $\nu$ to denote this
critical exponent is a rather unfortunate choice of notation, since in statistical mechanics $\nu$ refers to
a related but different critical exponent, see e.g. [14, Chapter 7]. The exponent for the width would be
denoted $2 - \alpha$, but we use $\nu$ in this paper to facilitate comparison with earlier studies of the
$k$-SAT transition.) It is not obvious a priori that $\nu$ is well-defined, as it could in principle slowly
oscillate with $n$ or depend upon $\varepsilon$. For 2-SAT it was proved recently [?] that the characteristic
width does in fact grow polynomially in $n$, and that $\nu = 3$. There have been a number of experimental
studies aimed at measuring the critical exponent $\nu$ for random $k$-SAT, as summarized in the above
table, and several authors have conjectured that the exponent $\nu$ tends to 1 as $k$ gets large. The
purpose of this note is to provide a simple proof that for each fixed $k$, the characteristic width is
always at least $\Theta(n^{1/2})$, so that in particular, if the exponent $\nu$ is well-defined, it is always
at least 2.

Remark: There is a related ensemble of random $k$-SAT formulas, in which each possible $k$-clause
appears in the formula with probability $p$ independently of the other clauses. For convenience we
let $M$ denote the total number $2^k\binom{n}{k}$ of possible clauses. When $pM \approx m$, this
$\mathcal{F}_{n,p}$ ensemble of random formulas will behave much like the $\mathcal{F}_{n,m}$ ensemble of
formulas defined above. But there is a limit on how sharp the SAT-to-UNSAT transition can be for the
$\mathcal{F}_{n,p}$ ensemble, due to the approximate relationship between $p$ and the number of clauses in
the formula. Even if $pM = m$, the number of clauses will be $m \pm \Theta(m^{1/2})$. It is thus
straightforward to show that in the $\mathcal{F}_{n,p}$ ensemble, the critical exponent $\nu$ must be at
least 2. (This bound is closely related to the "Harris criterion" for disordered statistical mechanical
systems [?].) The (by now) standard proof of $\nu \ge 2$ for the $\mathcal{F}_{n,p}$ ensemble depends on
the variance in the number of clauses, which is zero for the $\mathcal{F}_{n,m}$ ensemble. It is simple to
define properties on sets of clauses such that there is a much sharper transition in the
$\mathcal{F}_{n,m}$ ensemble, with a smaller value of $\nu < 2$; one trivial example is the property
"$m < 4.2n$", for which $\nu = 1$ in $\mathcal{F}_{n,m}$ while $\nu = 2$ in $\mathcal{F}_{n,p}$. Until now it
has been suggested (on the strength of Monte Carlo experiments and heuristic arguments) that
satisfiability is one such property. Despite this, we proceed to show that even for the
$\mathcal{F}_{n,m}$ ensemble, the characteristic width is at least $\Theta(n^{1/2})$, so that $\nu$ (if it is
well-defined) must be at least 2.
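The variance argument in the Remark can be illustrated numerically. The following is a minimal sketch, assuming the independent-inclusion ensemble with $pM = m$; the function name `clause_count_std` and the sample parameters are illustrative and do not come from the paper.

```python
import math

def clause_count_std(n: int, k: int, m: float) -> float:
    """Standard deviation of the number of clauses when each of the
    M = 2^k * C(n, k) possible k-clauses is included independently
    with probability p = m / M (so the clause count is Binomial(M, p))."""
    M = 2**k * math.comb(n, k)
    p = m / M
    return math.sqrt(M * p * (1.0 - p))

if __name__ == "__main__":
    # Illustrative parameters: n = 1000 variables, k = 3, about 4200 clauses.
    print(clause_count_std(1000, 3, 4200), math.sqrt(4200))
```

Because $p$ is tiny, the standard deviation is essentially $\sqrt{m}$, which is the source of the $\Theta(m^{1/2})$ fluctuation mentioned above.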

There are also a number of questionable conjectures about other features of random $k$-SAT
formulas. For instance, for 3-SAT Crawford and Auton [10] study the value of $m = m_{1/2}(n)$, where
$m_r(n)$ denotes the smallest value of $m$ for which the fraction of satisfiable $k$-SAT formulas is $< r$.
They fit $m_{1/2}(n)$ to a curve of the form $m_{1/2}(n) = \alpha_3 n + A n^{1-1/\nu}$, obtaining $\nu = 1$ [?]
and then later $\nu = 3/5$ [?]. (The integrality of $m$ by itself strongly suggests that $\nu \ge 1$.) While it
is in principle conceivable that $m_{1/2}(n) = \alpha_3 n + O(1)$, it is an easy consequence of our work that
there can be at most one value of $r$ for which $m_r(n) = \alpha_3 n + o(n^{1/2})$ (otherwise the
SAT-to-UNSAT transition would be too sharp). For 2-SAT the special value of $r$ is empirically about 91%,
and for 3-SAT there is no reason to believe (and experimental reason not to believe) that the special
value of $r$ is 1/2.

Selman and Kirkpatrick [?] estimated exponents for the characteristic width of the median
computational difficulty of determining whether or not a random formula is satisfiable, where
computational difficulty was measured in terms of the number of recursive calls made by Crawford
and Auton's SAT-solver (Tableau). The exponents they obtained were 1.3 for 3-SAT, 1.25 for
4-SAT, and 1.1 for 5-SAT. We do not analyze the specific SAT-solver Tableau, but we can say
something about the characteristic width of the computationally difficult problems for other
SAT-solvers. Many SAT-solvers use the "pure literal rule" before starting a backtracking search for a
satisfying assignment. A SAT-solver using this rule will look for literals $y$ in the formula such that
$y$'s negation $\bar{y}$ does not also appear in the formula. If such a literal $y$ exists, then the
SAT-solver sets $y$ to TRUE and removes from the formula any clauses containing $y$, since the resulting
simpler formula is satisfiable if and only if the original formula was satisfiable. Rigorously analyzing
the median computational difficulty seems not so easy, but using our methods one can show that for
these SAT-solvers the typical computational difficulty has critical exponent at least 2. By this we
mean the following: if for $m$ clauses there is probability $p$ that the number of recursive calls is
between $L$ and $U$, then when there are $m + \Delta$ clauses, the probability is
$p + O(\Delta/\sqrt{n}) + o(1)$.
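The pure literal rule described above can be sketched in a few lines. This is an illustrative sketch only, not the preprocessing of Tableau or of any particular solver; it assumes DIMACS-style integer literals (`+v` for a variable, `-v` for its negation) and applies the rule repeatedly until no pure literal remains. The function name `pure_literal_eliminate` is hypothetical.

```python
from typing import List, Tuple

Clause = Tuple[int, ...]  # assumed encoding: +v is variable v, -v is its negation

def pure_literal_eliminate(clauses: List[Clause]) -> List[Clause]:
    """Repeatedly delete every clause that contains a pure literal,
    i.e. a literal whose negation appears nowhere in the current formula."""
    remaining = list(clauses)
    while True:
        literals = {lit for clause in remaining for lit in clause}
        pure = {lit for lit in literals if -lit not in literals}
        if not pure:
            return remaining
        # Setting each pure literal to TRUE satisfies these clauses,
        # so they can be removed without changing satisfiability.
        remaining = [c for c in remaining if not any(lit in pure for lit in c)]

if __name__ == "__main__":
    formula = [(1, 2), (-1, 3), (-3, 1)]
    print(pure_literal_eliminate(formula))
```

In the example, literal 2 is pure, so the first clause is removed; the two remaining clauses contain no pure literal and would be passed on to the backtracking search.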

Our method is general enough to be applicable to other types of sharp transitions. For instance,
Pittel, Spencer, and Wormald prove that there is a sharp transition for the appearance of a
$q$-core in a random graph. (The $q$-core is the maximal subgraph for which each vertex has degree
at least $q$.) They prove that the width of this transition is at most $n^{1/2+o(1)}$, but gave no lower
bound. We prove that the width is at least $\Theta(n^{1/2})$. (Independent of this present work,
Kirkpatrick [19] has reported that experiments suggest that the width is $\Theta(n^{1/2})$.) We can also
supply lower bounds on the transition width of other graph properties, such as $q$-colorability.
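For readers unfamiliar with the $q$-core, the definition above can be computed by repeatedly deleting vertices of degree less than $q$. The following is a minimal sketch of that standard peeling procedure; the function name `q_core` and the edge-list representation are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple

def q_core(edges: Iterable[Tuple[int, int]], q: int) -> Set[int]:
    """Return the vertex set of the q-core: the maximal subgraph in which
    every vertex has degree at least q (empty set if there is none)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    stack = [v for v in adj if len(adj[v]) < q]
    removed = set()
    while stack:
        v = stack.pop()
        if v in removed:
            continue
        removed.add(v)
        for w in adj[v]:
            adj[w].discard(v)
            if w not in removed and len(adj[w]) < q:
                stack.append(w)
        adj[v].clear()
    return {v for v in adj if v not in removed}

if __name__ == "__main__":
    # A complete graph on {0, 1, 2, 3} with one pendant vertex 4.
    k4_plus_pendant = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
    print(q_core(k4_plus_pendant, 3))
```

The pendant vertex is peeled away and the complete graph on four vertices survives as the 3-core.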



2. Proofs of theorems 

We now give a proof that $\nu \ge 2$ in $\mathcal{F}_{n,m}$ that works simultaneously for $k$-SAT,
$q$-colorability, the existence of a $q$-core, and a variety of other properties. Let $M$ denote the
number of possible items. In the case of $k$-SAT, the items will be the possible clauses on $n$ variables,
and $M = 2^k\binom{n}{k}$. In the case of the $q$-core or $q$-colorability, the items will be the possible
edges of a graph on $n$ vertices, and $M = \binom{n}{2}$. A property classifies sets of items into two
types: sets which have the property (for convenience call them proper sets), and sets which do not have
the property (improper sets). Satisfiability is a property on sets of clauses, $q$-colorability is a
property on sets of edges, and the existence of a $q$-core of a graph is also a property on sets of edges.
We may also be interested in non-monotone properties, such as the property that a certain SAT-solver
does more than $L$ recursive calls to determine the satisfiability of a set of clauses.

A bystander rule is a way of partitioning a set of items into two classes: the relevant items, and 
the bystander items. A bystander rule must satisfy the following constraint. Given a set of items 
$A$, let $R$ be the relevant items, and let $G$ be the bystander items. Then for any $B \subseteq A$, it
must be that $B$ has the property if and only if $B \setminus G$ has the property. In this sense, the only
relevant items for the property are those that are contained in $R$; that is, if one restricts attention
to sets of items contained in $A$, the bystander items never affect the property.

In the case of $k$-SAT, we will use the "partially-free" bystander rule, which declares a clause
to be a bystander if the underlying variable of one of its literals does not appear anywhere else
in the formula. (Recall that the formula is the AND of the clauses in the set.) By setting this
variable to an appropriate value, the clause can be satisfied without affecting our ability to satisfy
the remaining clauses of the formula. Thus partially-free is in fact a valid bystander rule.
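The partially-free rule is easy to state operationally. The sketch below, under the assumption that literals are encoded as nonzero integers (`+v` / `-v`), returns the indices of the bystander clauses; the function name `partially_free_items` is hypothetical. A clause drawn twice (sampling with replacement) counts as two items, so its variables do appear "elsewhere".

```python
from collections import Counter
from typing import List, Sequence

def partially_free_items(clauses: Sequence[Sequence[int]]) -> List[int]:
    """Indices of clauses containing a variable that occurs in no other clause."""
    # Count, for each variable, the number of clauses in which it occurs.
    occurrences = Counter(
        var for clause in clauses for var in {abs(lit) for lit in clause}
    )
    return [
        i for i, clause in enumerate(clauses)
        if any(occurrences[abs(lit)] == 1 for lit in clause)
    ]

if __name__ == "__main__":
    formula = [(1, -2, 3), (2, -3, 4), (-1, 3, 5)]
    print(partially_free_items(formula))
```

In the example, variables 4 and 5 each occur in a single clause, so the second and third clauses are bystanders while the first is relevant. The same function applies to graphs if each edge is given as a pair of positive vertex labels, since an endpoint of degree 1 is exactly a vertex that occurs in no other edge.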

For the $q$-core and $q$-colorability, we also use the "partially-free" bystander rule, which in the
context of graphs (and hypergraphs) declares an edge to be a bystander if one of its endpoints has
degree 1. It is simple to check that partially-free is a valid bystander rule for these properties.

Remark: We could in principle use other bystander rules instead. One possibility is to declare
any clause that eventually gets resolved by repeated application of the pure literal rule to be a
bystander. But it is easier to analyze the partially-free bystander rule, and our objective is to
provide a simple rigorous proof.

Theorem 1. Suppose that a property has a bystander rule such that, in a set of $m$ random items,
with probability $1 - \varepsilon$ at least $\gamma m$ of the items are bystanders. (Items may be chosen
either without or with replacement.) Suppose further that a set of $m_1 \le m$ random items is proper with
probability $p_1$, and a set of $m_2 \le m$ random items is proper with probability $p_2$. If
$\beta \le m_1/m \le 1 - \beta$ and $\beta \le m_2/m \le 1 - \beta$, then

\[ |m_1 - m_2| \ge (|p_1 - p_2| - \varepsilon)\,\sqrt{2\pi m}\,\sqrt{\frac{\gamma\beta(1-\beta)}{1-\gamma}}\,(1 - o(1)), \]

where the $o(1)$ term becomes small if $m$ gets large while $\beta$ and $\gamma$ remain fixed.
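To get a feel for the size of the bound in Theorem 1, the following minimal sketch evaluates its leading term, deliberately ignoring the $(1 - o(1))$ factor. The function name `width_lower_bound` and the sample values are illustrative assumptions.

```python
import math

def width_lower_bound(m: float, beta: float, gamma: float,
                      p1: float, p2: float, eps: float = 0.0) -> float:
    """Leading term of the Theorem 1 lower bound on |m1 - m2|,
    with the (1 - o(1)) correction factor omitted."""
    gap = max(0.0, abs(p1 - p2) - eps)
    return gap * math.sqrt(2 * math.pi * m) * math.sqrt(
        gamma * beta * (1 - beta) / (1 - gamma)
    )

if __name__ == "__main__":
    # Illustrative values: half the items are bystanders, beta = 1/2,
    # and the probability of being proper drops from 2/3 to 1/3.
    print(width_lower_bound(10_000, 0.5, 0.5, 2 / 3, 1 / 3))
```

The bound scales as $\sqrt{m}$ for fixed $\beta$ and $\gamma$, which is what forces $\nu \ge 2$ when $m = \Theta(n)$.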

Informally, Theorem 1 says that if there are many bystanders, then the transition cannot be
too sharp. To prove this we use the following lemma:

Lemma 2. Suppose there are $m$ balls, of which $\gamma m$ are green and $(1 - \gamma)m$ are red, and that
$\beta m$ of these balls are randomly sampled without replacement. The probability that the
$(\beta m)$th ball is red and exactly $\ell$ of the sampled balls are red is, as a function of $\ell$,
unimodal and at most

\[ \frac{1 + o(1)}{\sqrt{2\pi m}}\,\sqrt{\frac{1-\gamma}{\gamma\beta(1-\beta)}}. \]

Here the $o(1)$ term becomes small when the expected number of sampled red balls, sampled green
balls, unsampled red balls, and unsampled green balls are each large.
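Lemma 2 can be checked numerically on a small instance. The sketch below computes the exact probability as a hypergeometric probability multiplied by $\ell/b$ (the chance that the last sampled ball is one of the $\ell$ red ones) and compares its maximum with the leading term of the lemma's bound. The function names and the parameters $m = 1000$, $\gamma = 0.4$, $\beta = 0.3$ are illustrative assumptions.

```python
from math import comb, pi, sqrt
from typing import List

def prob_bth_is_ellth_red(m: int, g: int, b: int, ell: int) -> float:
    """Exact probability that, when b of m balls (g green, m - g red) are
    sampled without replacement, exactly ell are red and the b-th is red."""
    r = m - g
    return (comb(r, ell) * comb(g, b - ell) * ell) / (comb(m, b) * b)

def lemma2_bound(m: int, gamma: float, beta: float) -> float:
    """Leading term of the Lemma 2 upper bound (the 1 + o(1) factor is omitted)."""
    return sqrt((1 - gamma) / (gamma * beta * (1 - beta))) / sqrt(2 * pi * m)

def is_unimodal(values: List[float]) -> bool:
    """True if the sequence never increases again after it has decreased."""
    falling = False
    for prev, cur in zip(values, values[1:]):
        if cur < prev:
            falling = True
        elif cur > prev and falling:
            return False
    return True

if __name__ == "__main__":
    m, g, b = 1000, 400, 300
    probs = [prob_bth_is_ellth_red(m, g, b, ell) for ell in range(b + 1)]
    print(is_unimodal(probs), probs.index(max(probs)))
    print(max(probs), lemma2_bound(m, g / m, b / m))
```

On this instance the maximizing $\ell$ agrees with the mode $\lceil rb/(r+g+1) \rceil$ identified in the proof of the lemma, and the maximum probability is close to the leading term of the bound.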

We postpone the proof of this lemma to §3, and proceed to the more interesting part of the proof.

Proof of Theorem 1. Let $C_1, \dots, C_M$ denote the items. If the items are selected without
replacement, let $\sigma$ be a uniformly random permutation on the numbers $1, \dots, M$. If the items are
selected with replacement, let $\sigma$ be an i.i.d. sequence of uniformly random integers in the range
$1, \dots, M$. Let $f_m$ denote the sequence consisting of the first $m$ items with respect to $\sigma$,
i.e. $f_m = (C_{\sigma(1)}, \dots, C_{\sigma(m)})$; $f_m$ is a uniformly random sequence of $m$ items chosen
without (resp. with) replacement. Let $g_m$ be the number of bystander items, and $r_m = m - g_m$ the
number of relevant items of $f_m$. Say that $\ell$ is a positively (resp. negatively) critical integer if
the set of the first $\ell$ relevant items of $f_m$ is proper (resp. improper) but the set of the first
$\ell - 1$ relevant items is improper (resp. proper). Pick a uniformly random permutation $\tau$ on the
numbers $1, \dots, m$ (independent of $\sigma$). Use $\tau$ to tag a random set of $b$ items from $f_m$,
i.e. $C_{\sigma(\tau(1))}, \dots, C_{\sigma(\tau(b))}$; the tagged items form a random set of $b$ items chosen
without (resp. with) replacement, since we could have picked $\tau$ first and then $\sigma$. Suppose that
$L$ of the tagged items are relevant. Since our sequence of items $f_m$ is already in a random order, we
may instead pick and keep the first $L$ relevant items of $f_m$ and the first $b - L$ bystander items of
$f_m$. The resulting set $f_{m,b}$ of kept items is a uniformly random set of $b$ items chosen without
(resp. with) replacement, and whether or not $f_{m,b}$ has the property is determined by $L$. We can write

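\begin{align*}
\Pr[f_{m,b} \text{ is proper} \mid f_m] - \Pr[f_{m,b-1} \text{ is proper} \mid f_m]
 &= \Pr[f_{m,b} \text{ is proper and } f_{m,b-1} \text{ is improper} \mid f_m] \\
 &\quad - \Pr[f_{m,b} \text{ is improper and } f_{m,b-1} \text{ is proper} \mid f_m] \\
 &= \Pr[\text{\# relevant tags is positively critical, $b$th tag is relevant} \mid f_m] \\
 &\quad - \Pr[\text{\# relevant tags is negatively critical, $b$th tag is relevant} \mid f_m].
\end{align*}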

Now we use the fact that the negatively critical integers and the positively critical integers are
interleaved, and that

\[ f(\ell) = \Pr[\text{\# relevant tags is } \ell, \text{ $b$th tag is relevant} \mid f_m] \]

is unimodal in $\ell$ (from Lemma 2). Let the critical integers be $\ell_1, \ell_2, \dots, \ell_c$, and
suppose that of these, $\ell_a$ maximizes $f(\cdot)$. Then we can write



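\[ \sum_{i=1}^{c} (-1)^{i-a} f(\ell_i) = f(\ell_a) - [f(\ell_{a+1}) - f(\ell_{a+2})] - \cdots
   - [f(\ell_{a-1}) - f(\ell_{a-2})] - \cdots \le f(\ell_a) \]

and

\begin{align*}
\sum_{i=1}^{c} (-1)^{i-a} f(\ell_i) &= [f(\ell_a) - f(\ell_{a+1})] + [f(\ell_{a+2}) - f(\ell_{a+3})] + \cdots \\
 &\quad - f(\ell_{a-1}) + [f(\ell_{a-2}) - f(\ell_{a-3})] + \cdots \\
 &\ge f(\ell_a) - f(\ell_{a+1}) - f(\ell_{a-1}) \ge -f(\ell_a).
\end{align*}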



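Thus

\begin{align*}
\bigl|\Pr[f_{m,b} \text{ is proper} \mid f_m] - \Pr[f_{m,b-1} \text{ is proper} \mid f_m]\bigr|
 &\le \max_{\ell} \Pr[\text{\# relevant tags is } \ell, \text{ $b$th tag is relevant} \mid f_m] \\
 &\le \frac{1 + o(1)}{\sqrt{2\pi m}}\,\sqrt{\frac{r_m/m}{(g_m/m)(b/m)(1 - b/m)}}
\end{align*}

by Lemma 2, and then assuming $g_m \ge \gamma m$ and $\beta m \le b \le (1 - \beta)m$ we get

\[ \le \frac{1 + o(1)}{\sqrt{2\pi m}}\,\sqrt{\frac{1-\gamma}{\gamma\beta(1-\beta)}}, \]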



where the $o(1)$ term becomes small if $m$ gets large while $\beta$ and $\gamma$ remain fixed. Thus if both
$m_1$ and $m_2$ are between $\beta m$ and $(1 - \beta)m$ we can write

\begin{align*}
|p_1 - p_2| &= \bigl|\Pr[f_{m,m_1} \text{ is proper}] - \Pr[f_{m,m_2} \text{ is proper}]\bigr| \\
 &\le \Pr[f_m \text{ has at least } \gamma m \text{ bystanders}] \times |m_1 - m_2|\,
      \frac{1 + o(1)}{\sqrt{2\pi m}}\,\sqrt{\frac{1-\gamma}{\gamma\beta(1-\beta)}} \\
 &\quad + \Pr[f_m \text{ has fewer than } \gamma m \text{ bystanders}] \\
 &\le |m_1 - m_2|\,\frac{1 + o(1)}{\sqrt{2\pi m}}\,\sqrt{\frac{1-\gamma}{\gamma\beta(1-\beta)}} + \varepsilon.
\end{align*}

□

To apply Theorem 1 to $k$-SAT, we need to show that many clauses are bystanders, and to apply
it to the $q$-core and $q$-colorability thresholds, we need to show that many edges are bystanders.

Lemma 3. Suppose that the items are $k$-clauses, edges, or hyperedges on $k$ vertices. Assume
that $k$ is fixed, and that $m = \Theta(n)$ random items are selected, either with replacement or without
replacement. With high probability there will be $(1 + o(1))\,m\,[1 - [1 - e^{-km/n}]^k]$ partially-free
items.

This lemma says that the number of partially-free clauses or hyperedges is with high probability
close to what one might naively expect. Since we use the lemma to disprove a number of
experimental results and heuristic arguments, we give a careful proof of it in §3. But first let us
see how to use the lemma with Theorem 1.
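The prediction of Lemma 3 is easy to compare against simulation. The following is a minimal sketch, assuming clauses are drawn with replacement and that only the variables of each clause matter (the signs of the literals do not affect whether a clause is partially-free). The function names, the seed, and the parameters are illustrative.

```python
import math
import random
from collections import Counter

def predicted_fraction(k: int, ratio: float) -> float:
    """Lemma 3 prediction 1 - (1 - exp(-k m/n))^k for the fraction of
    partially-free items, where ratio = m / n."""
    return 1.0 - (1.0 - math.exp(-k * ratio)) ** k

def simulated_fraction(n: int, m: int, k: int, rng: random.Random) -> float:
    """Fraction of partially-free items among m random k-subsets of n
    variables, drawn independently (i.e. with replacement)."""
    clauses = [rng.sample(range(n), k) for _ in range(m)]
    occurrences = Counter(v for clause in clauses for v in clause)
    free = sum(1 for clause in clauses
               if any(occurrences[v] == 1 for v in clause))
    return free / m

if __name__ == "__main__":
    rng = random.Random(0)
    n, m, k = 20_000, 10_000, 3
    print(simulated_fraction(n, m, k, rng), predicted_fraction(k, m / n))
```

At clause densities near the 3-SAT threshold the predicted fraction is tiny (of order $10^{-5}$), so the example uses a smaller ratio $m/n$ where the agreement is visible at modest $n$.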

Corollary 4. Let $p_1$ and $p_2$ be fixed numbers such that $1 > p_1 > p_2 > 0$. Suppose that a random
3-SAT formula with $n$ variables and $m_1$ clauses is satisfiable with probability $\ge p_1$ and a random
3-SAT formula with $n$ variables and $m_2$ clauses is satisfiable with probability $\le p_2$. Then

\[ m_2 - m_1 \ge (0.0015 + o(1)) \times (p_1 - p_2) \times \sqrt{n}, \]

where the $o(1)$ goes to 0 when $n \to \infty$. In particular, $\nu_3 \ge 2$ if it is well-defined. Similarly,
for $k$-SAT in general ($k$ fixed), we get $m_2 - m_1 \ge (p_1 - p_2)\,\Theta(\sqrt{n})$, implying
$\nu_k \ge 2$.

Proof. For 3-SAT, when $n$ is large, we know that $m_1/n$ and $m_2/n$ are both close to the critical ratio
$\alpha_3(n)$, where $3.42 \le \alpha_3(n) \le 4.571$. (Details of the 4.506 upper bound are not available at
this time, so here we use the established bound of 4.571 [18].) More generally for $k$-SAT, we know that
for $i = 1$ or 2, $\underline{c}_k - o(1) \le m_i/n \le \overline{c}_k + o(1)$, where $\underline{c}_k$ and
$\overline{c}_k$ are lower and upper bounds on the critical $k$-SAT ratio $c_k(n)$ (which could conceivably
be a function of $n$). Let $m = n(c_k(n) + t)$, where $t$ is a positive constant that we will choose in a
moment. By Lemma 3, the fraction of clauses which are partially-free is w.h.p.
$\gamma = 1 - [1 - e^{-k(c_k(n)+t)}]^k + o(1)$. The value of $\beta$ for Theorem 1 is $t/(c_k(n) + t)$.
Note that $\gamma$ is monotone decreasing in $c_k(n)$, and that
$m\beta(1 - \beta) = n t c_k(n)/(c_k(n) + t)$ is monotone increasing in $c_k(n)$, so that our bound from
Theorem 1 is at least as good as

\begin{align*}
|m_1 - m_2| &\ge (|p_1 - p_2| - o(1))\,\sqrt{2\pi n}\,
  \sqrt{\frac{t\,\underline{c}_k}{\underline{c}_k + t}}\,
  \sqrt{\frac{1 - [1 - e^{-k(\overline{c}_k + t)}]^k}{[1 - e^{-k(\overline{c}_k + t)}]^k}}\,(1 - o(1)) \\
 &= (p_1 - p_2)\,\Theta(\sqrt{n}).
\end{align*}

For 3-SAT we take $t = 0.3$ to get the above-stated constant of 0.0015. □
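The constant 0.0015 can be reproduced numerically. The sketch below evaluates the leading term of the Theorem 1 bound divided by $(p_1 - p_2)\sqrt{n}$, under the assumption (suggested by the monotonicity remarks in the proof) that the lower bound on the critical ratio is used in the $m\beta(1-\beta)$ factor and the upper bound is used in $\gamma$. The function name `corollary4_constant` is hypothetical.

```python
import math

def corollary4_constant(k: int, c_low: float, c_high: float, t: float) -> float:
    """Leading constant C such that the bound reads C * (p1 - p2) * sqrt(n).

    Assumption: c_low enters m*beta*(1-beta)/n = t*c/(c+t) (increasing in c)
    and c_high enters gamma = 1 - (1 - exp(-k(c+t)))^k (decreasing in c)."""
    x = math.exp(-k * (c_high + t))
    # gamma = 1 - (1 - x)^k, computed stably for very small x.
    gamma = -math.expm1(k * math.log1p(-x))
    m_beta_term = t * c_low / (c_low + t)
    return math.sqrt(2 * math.pi * m_beta_term * gamma / (1 - gamma))

if __name__ == "__main__":
    # 3-SAT with the bounds 3.42 and 4.571 quoted in the proof, and t = 0.3.
    print(corollary4_constant(3, 3.42, 4.571, 0.3))
```

The constant is small because at densities near 4.5 only about one clause in a million is partially-free; the point of the corollary is the $\sqrt{n}$ scaling rather than the size of the constant.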

For the appearance of a $q$-core in a random graph, there is a sharp threshold, and furthermore
the precise values of the critical ratio $m/n = c_q$ are known. It is known e.g. that $c_3 \approx 3.35$,
$c_4 \approx 5.14$, $c_5 \approx 6.81$, and $c_q = q + \sqrt{q \log q} + O(\log q)$ [?]. We can lower bound
the characteristic width for the appearance of the $q$-core in essentially the same way that we did for
$k$-SAT, except that here $k = 2$ even as $q$ varies. (A larger value of $k$ would correspond to the
appearance of the $q$-core within a $k$-uniform hypergraph.)

Corollary 5. For $q \ge 3$, the transition for existence of a $q$-core has characteristic width
$\ge \Theta(\sqrt{n})$.

When one randomly adds edges one at a time, w.h.p. the $q$-core jumps from size 0 to size $\Theta(n)$ with
the addition of a single edge [?]; Corollary 5 is a statement about the timing of this jump.

With $q$-colorability, it is known that there is a sharp threshold [?] for the number of edges that
a random graph can have while still being $q$-colorable, but as with $k$-SAT, it is not known that $c_q$
is a bona fide constant rather than a slowly varying function of $n$ that oscillates between its known
upper and lower bounds. Łuczak proved that $c_q/(q \log q) \to 1$ as $q \to \infty$, and it is known
that $2.01 \le c_3 \le 2.495$ [?] [?]. As with the $q$-core, here $k = 2$ even as $q$ varies, and we have

Corollary 6. For $q \ge 3$, the transition for $q$-colorability has characteristic width at least
$\Theta(\sqrt{n})$.

3. Proofs of lemmas

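Proof of Lemma 2. For convenience let $g = \gamma m$ be the number of green balls, $r = (1 - \gamma)m$ be
the number of red balls, and $b = \beta m$ be the number of balls selected. The precise probability that
the $b$th ball is the $\ell$th red ball is

\[ \frac{\binom{r}{\ell}\binom{g}{b-\ell}}{\binom{m}{b}} \cdot \frac{\ell}{b}. \]

The ratio of successive terms is

\[ \frac{\binom{r}{\ell+1}\binom{g}{b-\ell-1}\,\frac{\ell+1}{b}}{\binom{r}{\ell}\binom{g}{b-\ell}\,\frac{\ell}{b}}
   = \frac{(r-\ell)(b-\ell)}{\ell\,(g - b + \ell + 1)}, \]

which is monotone in $\ell$ and implies the unimodality claim. Next we identify the mode:

\begin{align*}
\ell\,(g - b + \ell + 1) &\ge (r - \ell)(b - \ell) \\
\ell\,(g - b + 1 + b + r) &\ge rb \\
\ell &\ge \frac{rb}{r + g + 1}
\end{align*}

so that the optimal $\ell$ is given by $\ell = \lceil rb/(r + g + 1) \rceil$, which is within 1 of
$rb/(r + g)$.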

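We next approximate the maximum value of this probability. For convenience let $\lambda = \ell/m$.
Recall Stirling's formula: $n! = n^n e^{-n}\sqrt{2\pi n}\,\exp[1/(12n + \delta_n)]$ where
$0 < \delta_n < 1$.

\begin{align*}
\frac{\binom{r}{\ell}\binom{g}{b-\ell}}{\binom{m}{b}} \cdot \frac{\ell}{b}
 &= \frac{\lambda}{\beta} \times
    \frac{((1-\gamma)m)!\,(\gamma m)!\,(\beta m)!\,((1-\beta)m)!}
         {(\lambda m)!\,((1-\gamma-\lambda)m)!\,((\beta-\lambda)m)!\,((\gamma-\beta+\lambda)m)!\,m!} \\
 &= \frac{\lambda}{\beta} \times
    \left[\frac{(1-\gamma)^{1-\gamma}\,\gamma^{\gamma}\,\beta^{\beta}\,(1-\beta)^{1-\beta}}
               {\lambda^{\lambda}\,(1-\gamma-\lambda)^{1-\gamma-\lambda}\,(\beta-\lambda)^{\beta-\lambda}\,
                (\gamma-\beta+\lambda)^{\gamma-\beta+\lambda}}\right]^{m} \\
 &\quad \times \frac{1}{\sqrt{2\pi m}}\,
    \sqrt{\frac{(1-\gamma)\gamma\beta(1-\beta)}{\lambda(1-\gamma-\lambda)(\beta-\lambda)(\gamma-\beta+\lambda)}}
    \times \exp(o(1)).
\end{align*}

Consider the $\exp(o(1))$ error term arising from the $\exp[1/(12n + \delta_n)]$ portion of Stirling's
formula. Since $\ell \le r$, $r - \ell \le m - b$, $b - \ell \le b$, and $g - b + \ell \le g$, the error term
will be $\le 1$, so we may drop it to get an upper bound. If the second term on the right were larger than
1, then we could increase $m$ while keeping the ratios $g/m$, $\ell/m$, and $b/m$ fixed, and thereby make
the probability as large as we like, and in particular larger than 1. Thus we can drop this term as well:

\[ \frac{\binom{r}{\ell}\binom{g}{b-\ell}}{\binom{m}{b}} \cdot \frac{\ell}{b}
   \le \frac{1}{\sqrt{2\pi m}}\,
   \sqrt{\frac{(1-\gamma)\gamma\lambda(1-\beta)}{\beta(1-\gamma-\lambda)(\beta-\lambda)(\gamma-\beta+\lambda)}} \]

and upon substituting $\lambda = (1 - \gamma)\beta \pm 1/m$ we find

\begin{align*}
 &\le \frac{1}{\sqrt{2\pi m}}\,
   \sqrt{\frac{(1-\gamma)\gamma(1-\gamma)\beta(1-\beta)}
              {\beta\,((1-\gamma)(1-\beta))\,(\gamma\beta)\,(\gamma(1-\beta))}} \times (1 + o(1)) \\
 &= \frac{1}{\sqrt{2\pi m}}\,\sqrt{\frac{1-\gamma}{\gamma\beta(1-\beta)}} \times (1 + o(1)),
\end{align*}

where the $o(1)$ vanishes when $\beta(1 - \gamma) \gg 1/m$, $\beta\gamma \gg 1/m$,
$(1 - \beta)(1 - \gamma) \gg 1/m$, and $(1 - \beta)\gamma \gg 1/m$. □

Remark: One referee suggested an alternative to Lemma 3, which has worse constants but a
much shorter proof. The proof of the alternative lemma focuses on "nearly free" variables/vertices
rather than clauses/edges, and uses standard large deviation inequalities for martingales. We give
here the original proof since it requires less background.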

Proof of Lemma 3. Let $r = 2$ if the items we are interested in are clauses of $k$ Boolean variables,
and let $r = 1$ if the items are edges of a graph ($k = 2$) or $k$-uniform hypergraph ($k > 2$). Recall
that $n$ is the number of variables or vertices. In each of these cases, the number $M$ of possible items
is $M = r^k\binom{n}{k}$. Recall our assumption that $k$ is fixed and that we are looking at sets of
$m = \Theta(n)$ items, so that functions of $k$ and $m/n$ may be written as $O(1)$. A more detailed analysis
could determine what happens when e.g. $k \to \infty$ with $n$, but we do not attempt this.

Say that an item is $d$-free ($1 \le d \le k$), with respect to a set of items, if the first $d$ variables
(if the item is a clause) or the first $d$ vertices (if the item is an edge or hyperedge) do not occur in
any other items in the set.

The probability that an item is $d$-free when the $m$ items are randomly selected without replacement
is easily seen to be

\[ \Pr[\text{item is $d$-free}]
   = \frac{\binom{r^k\binom{n-d}{k}}{m-1}}{\binom{r^k\binom{n}{k} - 1}{m-1}}
   = \prod_{i=0}^{m-2}
     \frac{r^k\,\frac{(n-d)(n-d-1)\cdots(n-d-k+1)}{k!} - i}{r^k\,\frac{n(n-1)\cdots(n-k+1)}{k!} - (i+1)}. \]

We use the identity

\[ \frac{a - \delta}{b - \delta} = \frac{a}{b} - \delta\,\frac{1 - a/b}{b - \delta} \]

to estimate each term in the product. Writing $(n)_k = n(n-1)\cdots(n-k+1)$,

\begin{align*}
\frac{M\,\frac{(n-d)_k}{(n)_k} - i}{M - (i+1)}
 &= \frac{(n-d)_k}{(n)_k} - (i+1)\,\frac{1 - (n-d)_k/(n)_k}{M - (i+1)} + \frac{1}{M - (i+1)} \\
 &= \frac{(n-d)_k}{(n)_k}\left(1 + \frac{O(1/n)\,O(m) + O(1)}{M}\right) \\
 &= \frac{(n-d)_k}{(n)_k}\,\exp[O(1/M)] = \exp[-dk/n + O(1/n^2)].
\end{align*}

(When sampling is done with replacement, the $\exp[O(1/M)]$ error term does not appear.) From
this we see that the probability that an item in the set is $d$-free is $\exp[-(1 + o(1))dkm/n]$. Thus
the expected number of $d$-free items is $m\exp[-(1 + o(1))dkm/n]$. We wish to show that the actual
number of $d$-free items will likely be close to its expected value, so we bound the variance.

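Let $X_C^{(d)}$ be the indicator random variable for item $C$ being $d$-free. For $C \ne C'$, when sampling
is done without replacement we have

\begin{align*}
\Pr[\text{items $C$ and $C'$ $d$-free}]
 &= \Pr[\text{first $d$ variables/vertices of $C$ are not in $C'$ and vice versa}] \\
 &\quad \times \prod_{i=0}^{m-3}
   \frac{r^k\,\frac{(n-2d)(n-2d-1)\cdots(n-2d-k+1)}{k!} - i}{r^k\,\frac{n(n-1)\cdots(n-k+1)}{k!} - (i+2)} \\
 &= \left(1 - \frac{2dk - d^2 + o(1)}{n}\right) \prod_{i=0}^{m-3}
   \frac{r^k\,\frac{(n-2d)_k}{k!} - i}{r^k\,\frac{(n)_k}{k!} - (i+2)}.
\end{align*}

In the same manner as above we estimate each term in the product:

\[ \frac{r^k\,\frac{(n-2d)_k}{k!} - i}{r^k\,\frac{(n)_k}{k!} - i - 2}
   = \frac{(n-2d)_k}{(n)_k}\cdot\exp[O(1/M)]. \]

(As before, when sampling with replacement, the $\exp[O(1/M)]$ error term does not appear.) Thus

\begin{align*}
\frac{E[X_C^{(d)} X_{C'}^{(d)}]}{E[X_C^{(d)}]\,E[X_{C'}^{(d)}]}
 &= \frac{1 - (2kd - d^2 + o(1))/n}{(1 - (kd + o(1))/n)^2}
    \left(\frac{(n-2d)_k\,(n)_k}{(n-d)_k\,(n-d)_k}\right)^{m-2} \exp(O(m/n^k)) \\
 &= \left(1 + \frac{d^2 + o(1)}{n}\right)
    \left[\prod_{j=0}^{k-1} \frac{(n-2d-j)(n-j)}{(n-d-j)(n-d-j)}\right]^{m-2} \exp(O(m/n^k)) \\
 &= \left(1 + \frac{d^2 + o(1)}{n}\right)
    \left[\prod_{j=0}^{k-1} \left(1 - (1 + o(1))\frac{d^2}{n^2}\right)\right]^{m-2} \exp(O(m/n^k)) \\
 &= 1 + O(1/n),
\end{align*}

so that

\begin{align*}
E[X_C^{(d)} X_{C'}^{(d)}] &= E[X_C^{(d)}]\,E[X_{C'}^{(d)}] + O(1/n) \\
\mathrm{Cov}(X_C^{(d)}, X_{C'}^{(d)}) &= O(1/n),
\end{align*}

yielding the variance in the number of $d$-free items

\[ \mathrm{Var}\Bigl[\sum_C X_C^{(d)}\Bigr]
   = m\,\mathrm{Var}(X_C^{(d)}) + m(m-1)\,\mathrm{Cov}(X_C^{(d)}, X_{C'}^{(d)}) = O(m). \]

Using Chebyshev's inequality, it follows that the actual number of $d$-free items will with high
probability be within $O(\sqrt{m})$ of its expected value $m\exp[-(1 + o(1))dkm/n]$.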

Recall that an item is partially free if at least one of its variables/vertices is not contained in
any of the other items. We use inclusion-exclusion to estimate the number of partially free items.
Let $X_C^{+}$ denote the event that item $C$ is partially free, and $X_C^{j_1,\dots,j_d}$ denote the event
that item $C$ is free in positions $j_1, \dots, j_d$. Then

\begin{align*}
X_C^{+} &= 1 - \sum_{S \subseteq \{1,\dots,k\}} (-1)^{\#S} X_C^{S} \\
\sum_C X_C^{+} &= m - \sum_{S \subseteq \{1,\dots,k\}} (-1)^{\#S} \sum_C X_C^{S}.
\end{align*}

Since there are $2^k = O(1)$ possible values of $S$, and for each one $\sum_C X_C^{S}$ is with high
probability within $O(\sqrt{m})$ of $m\exp[-(1 + o(1))(\#S)km/n]$, the number of partially free clauses is
with high probability within $O(\sqrt{m})$ of

\begin{align*}
m - \sum_{S \subseteq \{1,\dots,k\}} (-1)^{\#S}\, m\exp[-(1 + o(1))(\#S)km/n]
 &= m(1 + o(1))\left[1 - \sum_{d=0}^{k} \binom{k}{d} (-1)^d e^{-dkm/n}\right] \\
 &= m(1 + o(1))\bigl[1 - [1 - e^{-km/n}]^k\bigr].
\end{align*}

□